翻訳と辞書 |
correlation clustering : ウィキペディア英語版 | correlation clustering Clustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of clusters without specifying that number in advance.〔Becker, Hila, ("A Survey of Correlation Clustering", 5 May 2005 )〕 ==Description of the problem== (詳細はmachine learning, correlation clustering or cluster editing operates in a scenario where the relationships between the objects are known instead of the actual representations of the objects. For example, given a weighted graph where the edge weight indicates whether two nodes are similar (positive edge weight) or different (negative edge weight), the task is to find a clustering that either maximizes agreements (sum of positive edge weights within a cluster plus the absolute value of the sum of negative edge weights between clusters) or minimizes disagreements (absolute value of the sum of negative edge weights within a cluster plus the sum of positive edge weights across clusters). Unlike other clustering algorithms this does not require choosing the number of clusters in advance because the objective, to minimize the sum of weights of the cut edges, is independent of the number of clusters. It may not be possible to find a perfect clustering, where all similar items are in a cluster while all dissimilar ones are in different clusters. If the graph indeed admits a perfect clustering, then simply deleting all the negative edges and finding the connected components in the remaining graph will return the required clusters. But, in general a graph may not have a perfect clustering. For example, given nodes ''a,b,c'' such that ''a,b'' and ''a,c'' are similar while ''b,c'' are dissimilar, a perfect clustering is not possible. In such cases, the task is to find a clustering that maximizes the number of agreements (number of + edges inside clusters minus the number of - edges between clusters) or minimizes the number of disagreements (the number of - edges inside clusters minus the number of + edges between clusters). This problem of maximizing the agreements is NP-complete (multiway cut problem reduces to maximizing weighted agreements and the problem of partitioning into triangles can be reduced to the unweighted version).
抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「correlation clustering」の詳細全文を読む
スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース |
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.
|
|